Conditional Random Field Based Sentence Context Identification: Enhancing Citation Services for the Research Community
Abstract
Academic publishers' full-text databases are an important part of the deep Web for researchers and a potentially valuable resource for automated extraction of scientific knowledge. Recently, some major publishers have provided Web APIs for accessing their article databases, allowing the development of Web applications that mine these resources. However, knowledge discovery from academic articles, particularly from citations, remains a challenge. This paper presents our work on identifying the contexts associated with sentences in academic articles and on using this information to provide information services for the research community. To this end, we propose an annotation scheme for sentences in academic articles. We also describe our experiments with conditional random fields for sentence classification. Finally, we present CitContExt, a citation context extraction application built on these techniques.
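As a rough illustration of how a linear-chain conditional random field labels a sequence of sentences, the Python sketch below scores each sentence with per-label feature weights plus a label-to-label transition weight, then picks the best label sequence with Viterbi decoding. The label names, features, and hand-set weights are all hypothetical placeholders, not the annotation scheme or the model described in the paper; a real CRF would learn its weights from annotated articles.

```python
# Hypothetical label set; the paper's actual annotation scheme is not given here.
LABELS = ["BACKGROUND", "CITATION", "OTHER"]

# Hand-set weights standing in for learned CRF parameters (assumptions).
BIAS = {"BACKGROUND": 0.5}
EMISSION = {
    ("has_citation_marker", "CITATION"): 2.0,
    ("starts_with_we", "OTHER"): 1.5,
}
TRANSITION = {
    ("BACKGROUND", "BACKGROUND"): 0.5,
    ("CITATION", "CITATION"): 0.5,
}


def features(sentence):
    """Toy binary features for one sentence (illustrative only)."""
    text = sentence.lower()
    return {
        "has_citation_marker": "[" in text or "et al" in text,
        "starts_with_we": text.startswith("we "),
    }


def emission_score(feats, label):
    """Sum the bias and the weights of the active features for this label."""
    score = BIAS.get(label, 0.0)
    for name, active in feats.items():
        if active:
            score += EMISSION.get((name, label), 0.0)
    return score


def decode(sentences):
    """Return the highest-scoring label sequence (Viterbi decoding)."""
    if not sentences:
        return []
    feats = [features(s) for s in sentences]
    # best[t][label] = score of the best path ending in `label` at sentence t
    best = [{lab: emission_score(feats[0], lab) for lab in LABELS}]
    back = []  # back[t-1][label] = previous label on that best path
    for t in range(1, len(sentences)):
        scores, pointers = {}, {}
        for lab in LABELS:
            prev = max(
                LABELS,
                key=lambda p: best[-1][p] + TRANSITION.get((p, lab), 0.0),
            )
            scores[lab] = (
                best[-1][prev]
                + TRANSITION.get((prev, lab), 0.0)
                + emission_score(feats[t], lab)
            )
            pointers[lab] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


example = [
    "Citation analysis is widely used.",
    "Smith et al. [3] proposed a scheme.",
    "We extend that scheme.",
]
print(decode(example))
```

The transition weights are what distinguish this from classifying each sentence independently: the label chosen for one sentence influences the score of its neighbours' labels.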
Related articles
Natural Language Engineering
Scientific literature is an important medium for disseminating scientific knowledge. However, in recent times, a dramatic increase in research output has resulted in challenges for the research community. An increasing need is felt for tools that exploit the full content of an article and provide insightful services with value beyond quantitative measures such as impact factors and citation cou...
Contextual information retrieval in research articles: Semantic publishing tools for the research community
In recent years, the dramatic increase in academic research publications has gained significant research attention. Research has been carried out exploring novel ways of providing information services using this research content. However, the task of extracting meaningful information from research documents remains a challenge. This paper presents our research work on developing intelligent inf...
Chunking-based Question Type Identification for Multi-Sentence Queries
This paper describes a technique of question type identification for multi-sentence queries in open domain question-answering. Based on observations of queries in real question-answering services on the Web, we propose a method to decompose a multi-sentence query into question items and to identify their question types. The proposed method is an efficient sentence-chunking based technique by us...
Conditional Cash Transfers for Maternal Health Interventions: Factors Influencing Uptake in North-Central Nigeria
Background: Nigeria accounts for a significant proportion of global maternal mortality figures, with little progress made in curbing poor health indices. In a bid to reverse this trend, the Government of Nigeria initiated a conditional cash transfer (CCT) programme to encourage pregnant women to utilize services at designated health facilities. This study aims to understand experiences of women who ...
Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization
We present a novel answer summarization method for community Question Answering services (cQAs) to address the problem of “incomplete answer”, i.e., the “best answer” of a complex multi-sentence question misses valuable information that is contained in other answers. In order to automatically generate a novel and non-redundant community answer summary, we segment the complex original multi-sent...